Hyperbolic estimation of sparse models from erratic data
Authors
Abstract
We have developed a hyperbolic penalty function for image estimation. Near its center a hyperbola is parabolic, like an l2-norm fit; its asymptotes behave like an l1-norm fit. A transition threshold must be chosen for the data-fitting regression equations and another for the model regularization. To solve this problem we combined two methods, Newton's and a variant of conjugate gradients, in a scheme we call the hyperbolic conjugate direction (HYCD) method. We tested three examples: (1) velocity transform with strong noise, (2) migration of aliased data, and (3) blocky interval velocity estimation. For the linear experiments we performed in this study, nonlinearity is introduced by the hyperbolic objective function, but the convexity of the sum of the hyperbolas assures the convergence of gradient methods. Given the reliable performance obtained on these three mainstream geophysical applications, we expect the HYCD solver to become our default method.

INTRODUCTION

In geophysics, conjugate-gradient methods are widely used for their simplicity, reliability, and fast convergence. Traditionally, we use the l2 norm to measure both data fitting and model regularization. When least-squares (l2) data fitting is changed to least-absolute-values (l1) data fitting, arbitrarily large outliers can be tolerated. This property is called "robustness" (Huber, 1964; Claerbout and Muir, 1973; Darche, 1989; Nichols, 1994; Guitton, 2005; Candès et al., 2006). At the same time, model regularization using the l1 norm leads to sparse models (Valenciano et al., 2004; Donoho, 2006b). Despite numerous l1 optimization algorithms and their applications in the compressive-sensing and computer-science communities (Schmidt et al., 2007; Candès et al., 2006; Donoho, 2006a), we find that for most geophysical applications a pure l1-norm objective function is undesirable, because tiny residuals always have as large an effect on the gradient as giant ones.
Instead, we seek merely to preserve the desirable l1 characteristics in solutions of large problems such as image estimation. This led us to the hyperbolic penalty function, which is l2-like for small residuals and l1-like for large ones. This penalty function has also been called the "hybrid norm" (Bube and Langan, 1997). Previously, we solved problems requiring robustness and sparseness by the method of iteratively reweighted least squares (IRLS) (Gersztenkorn et al., 1986; Guitton and Verschuur, 2004; Daubechies et al., 2010), a method that is cumbersome because it requires numerically motivated parameters for which there is little theoretical guidance, so each application demands fresh experimentation. Another widely used package for large-scale optimization, the limited-memory variant of the Broyden–Fletcher–Goldfarb–Shanno (L-BFGS) algorithm (Liu and Nocedal, 1989), has recently been extended with the orthant-wise limited-memory quasi-Newton (OWL-QN) method (Andrew and Gao, 2007) to meet the need for l1-type regularization in the model space. However, L-BFGS requires a differentiable measure of data fitting, which rules out an l1 data-fitting objective for this family of methods. Our experience shows that we need two different hyperbolic penalty functions: one for the data-fitting objective function and one for the model-styling objective function. In this paper, we use the term "model styling" instead of "model regularization" to acknowledge the subjectivity in choosing the regularizer. Each objective function requires a residual threshold: call it Rd for data fitting and Rm for model styling. Rather than being products of numerical analysis, the thresholds Rd and Rm have quite physical meanings. Here are two examples. For a shot gather with about 30% of its area saturated with ground roll, choose Rd around the 70th percentile of the fitting residual.
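The transition behavior and the percentile rule for choosing Rd can be sketched as follows. The functional form h(r) = sqrt(r^2 + R^2) - R and the synthetic residuals below are illustrative assumptions, not the paper's exact parameterization.

```python
import numpy as np

def hyperbolic_penalty(r, R):
    """Hybrid-norm penalty: behaves like r**2 / (2*R) for |r| << R
    and like |r| - R for |r| >> R.  (Illustrative form; the paper's
    exact parameterization may differ.)"""
    return np.sqrt(r**2 + R**2) - R

# Synthetic fitting residuals standing in for a shot gather where
# roughly 30% of the samples are saturated with ground roll.
rng = np.random.default_rng(0)
signal_part = rng.normal(0.0, 1.0, 700)    # well-fit 70% of samples
ground_roll = rng.normal(0.0, 20.0, 300)   # noise-saturated 30%

residuals = np.concatenate([signal_part, ground_roll])

# Choose Rd around the 70th percentile of |residual|, so that the
# noisy 30% falls into the l1 (outlier-tolerant) zone.
Rd = np.percentile(np.abs(residuals), 70)
```

With this choice the well-fit residuals sit in the parabolic l2 region, while the ground-roll residuals land on the l1-like asymptotes, where their influence on the gradient is bounded.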
Sometimes, geologists prefer the earth to be blocky, with distinct lithologies. Therefore, we seek earth models that are as blocky as the geology requires; in other words, models whose derivatives are spiky. For blocks about 20 mesh points long, the spikes should average about 20 points apart. Thus about 95% of the residuals should lie in the l2 region, with only about 5% in the l1 region, allowing those 5% of spikes to be of unlimited size. This places Rm at about the 95th percentile of the model-styling residual. The subjectively best Rd and Rm can be found within a limited interval around these physical interpretations. These examples also lead us to conclude that, in a wide variety of practical cases, fitting goals for data and model need not depart far from the usual l2 norm, but they do need to admit some residual values out in the l1 zone, possibly very far out in it.

In this paper, we propose a new numerical method inspired by two old ones: Newton's method and a variant of conjugate gradients known as conjugate directions. Because the objective function is in general defined by two different hyperbolas, we name our method the hyperbolic conjugate direction (HYCD) method. HYCD keeps the methodological simplicity of conjugate-gradient methods and adds only a little cost to each conjugate-direction iteration. The convexity of the hyperbolas assures convergence.

Experiments on three applications, (1) velocity transform with strong noise, (2) migration of aliased data, and (3) blocky interval velocity estimation, demonstrate the utility and robustness of our HYCD solver.

Manuscript received by the Editor 11 March 2011; revised manuscript received 3 August 2011; published online 1 February 2012. Stanford University, Department of Geophysics, Stanford, California, USA. E-mail: [email protected]; [email protected]; [email protected]. © 2012 Society of Exploration Geophysicists. All rights reserved. GEOPHYSICS, Vol. 77, No. 1 (January–February 2012), p. V1–V9, 8 figs. doi:10.1190/GEO2011-0099.1
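To illustrate why convexity of the summed hyperbolas lets simple gradient methods converge even in the presence of gross outliers, the sketch below fits a toy robust regression by plain gradient descent. This is only an illustration under assumed forms: it is not the HYCD Newton/conjugate-direction scheme itself, and the operator G, data, threshold R, and step size are made up for the example.

```python
import numpy as np

def hyperbolic_misfit(m, G, d, R):
    """Sum of hyperbolic penalties on the residual r = G m - d."""
    r = G @ m - d
    return np.sum(np.sqrt(r**2 + R**2) - R)

def misfit_gradient(m, G, d, R):
    r = G @ m - d
    # d/dr sqrt(r^2 + R^2) = r / sqrt(r^2 + R^2), which is bounded by 1:
    # an arbitrarily large outlier pulls with at most unit "force".
    return G.T @ (r / np.sqrt(r**2 + R**2))

rng = np.random.default_rng(1)
G = rng.standard_normal((50, 3))        # toy linear operator
m_true = np.array([1.0, -2.0, 0.5])
d = G @ m_true
d[::10] += 100.0                        # five enormous outliers

m = np.zeros(3)
for _ in range(3000):
    # fixed small step; the convex, smooth objective guarantees descent
    m -= 0.01 * misfit_gradient(m, G, d, R=1.0)
# m ends up close to m_true despite the outliers
```

A least-squares fit of the same data would be pulled far off by the outliers; the bounded gradient of the hyperbolic misfit is what keeps the estimate near the true model.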
Similar Articles
Robust Estimation in Linear Regression with Multicollinearity and Sparse Models
One of the factors affecting the statistical analysis of the data is the presence of outliers. The methods which are not affected by the outliers are called robust methods. Robust regression methods are robust estimation methods of regression model parameters in the presence of outliers. Besides outliers, the linear dependency of regressor variables, which is called multicollinearity...
Estimation of Value at Risk (VaR) Based On Lévy-GARCH Models: Evidence from Tehran Stock Exchange
This paper aims to estimate the Value-at-Risk (VaR) using GARCH type models with improved return distribution. Value at Risk (VaR) is an essential benchmark for measuring the risk of financial markets quantitatively. The parametric method, historical simulation, and Monte Carlo simulation have been proposed in several financial mathematics and engineering studies to calculate VaR, that each of ...
Improved Channel Estimation for DVB-T2 Systems by Utilizing Side Information on OFDM Sparse Channel Estimation
The second generation of digital video broadcasting (DVB-T2) standard utilizes orthogonal frequency division multiplexing (OFDM) system to reduce and to compensate the channel effects by utilizing its estimation. Since wireless channels are inherently sparse, it is possible to utilize sparse representation (SR) methods to estimate the channel. In addition to sparsity feature of the channel, the...
Large-scale Inversion of Magnetic Data Using Golub-Kahan Bidiagonalization with Truncated Generalized Cross Validation for Regularization Parameter Estimation
In this paper a fast method for large-scale sparse inversion of magnetic data is considered. The L1-norm stabilizer is used to generate models with sharp and distinct interfaces. To deal with the non-linearity introduced by the L1-norm, a model-space iteratively reweighted least squares algorithm is used. The original model matrix is factorized using the Golub-Kahan bidiagonalization that proje...
Hyperbolic Cosine Log-Logistic Distribution and Estimation of Its Parameters by Using Maximum Likelihood, Bayesian and Bootstrap Methods
In this paper, a new probability distribution, based on the family of hyperbolic cosine distributions, is proposed and its various statistical and reliability characteristics are investigated. The new category of HCF distributions is obtained by combining a baseline F distribution with the hyperbolic cosine function. Based on the base log-logistic distribution, we introduce a new di...
Sparse Structured Principal Component Analysis and Model Learning for Classification and Quality Detection of Rice Grains
In scientific and commercial fields associated with modern agriculture, the categorization of different rice types and determination of its quality is very important. Various image processing algorithms are applied in recent years to detect different agricultural products. The problem of rice classification and quality detection in this paper is presented based on model learning concepts includ...